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Abstract 



^*^ . Consider a stochastic process X„, n = 0, 1, 2, ...such that EX„ — > Xoo 

M ' as n — >■ oo. The sequence {Xn} may be a deterministic one, obtained 

C^_^ , by using a numerical integration scheme, or obtained from Monte-Carlo 

methods involving an approximation to an integral, or a Newton-Raphson 

iteration to approximate the root of an equation but we will assume that 

we can sample from the distribution of Xi, X2, ■■■Xm for finite m. We 

04 ' propose a scheme for unbiased estimation of the limiting value a; 00, to- 

^ ' gether with estimates of standard error and apply this to examples in- 

00 , eluding numerical integrals, root-finding and option pricing in a Heston 

^^ ' Stochastic Volatility model. 
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1 Introduction. 

Suppose Xn, n = 0, 1,2, ... is a stochastic process such that E{Xn) -^ x^o as 
n — >■ oo. Typicahy Xo = a:;o is a deterministic seed or arbitrary value initiating 
the iteration and we are interested in the hmiting value Xoo- The sequence 
{X„} may be deterministic, obtained by using a numerical integration scheme 
to approximate an integral, or a Newton-Raphson scheme to approximate the 
root of an equation. It may be a ratio estimator estimating a population ratio or 
the result of a stochastic or deterministic approximation to a root or maximum. 
In general we will only assume that it is possible to sample from the distribution 
of the stochastic process for a finite period, i.e. sample Xi, X2, ■■■Xm for fixed 
m. 

A common argument advanced in favour of the use Monte Carlo (MC) meth- 
ods as an alternative to numerical ones is that the MC estimator is usually 
unbiased with estimable variance. By increasing the sample size we are assured 
by unbiasedness that the estimator is consistent and we can produce, for any 
sample size, a standard error of the estimator. The statistical argument is 
advanced against the use of numerical methods that they do not offer easily 
obtained estimates of error. The purpose of this brief note is to show that this 
argument is flawed; generally any consistent sequence of estimators can be easily 
rendered unbiased and an error estimate is easily achieved. We do not attempt 
to merely reduce the bias, but by introducing randomization into the sequence, 
to completely eliminate it. The price we pay is an additional randomization 
inserted into the sequence and a possible increase in the mean squared error 
(MSE). 

2 The debiased sequence and its variance 

Suppose iV is a random variable, independent of the sequence {X^ n = 0, 1, 2, ...} 
taking finite non-negative integer values. Suppose Qn = P{N > n) > for all 
n = 1, 2, ... For a given sequence X„, n = 0,l,2,... we define the first backward 
difference as VX„ = Xn — Xn-i- Define the random variable 

n— 1 

^ ^ ^ I{n<N) 

n=l '^" 

This can be written in the more general form 

00 

Y = Xa + Y, V^nK (1) 

n=l 

where Fn,n ~ 1,2, ...are random variables with E[Fn\Xm Xn-i] — 1 and 
for some value of iV < 00 we have R = for i > N. We will show that 



Y is an unbiased estimator of the limit Xoc with easily estimated standard 
error. It is obviously unbiased provided that it is integrable and one can 
interchange the sum and the expected value since with J^ — cr(Xi,X2, ), 

Assume for the calculation of the variance that _E(X„ — Xqo)^ — )■ as n — )■ oo . 
Then 

a\ = var[Y) = E{var{J\T)\ + var{E{J\T)) 
= E\var{Y\T)\ + varixa^) 
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where E[VX^] = EXl - EX^_^ = a^ + ^2 „ ^2_^ _ ^2_^ ^ ^(^2 ^ ^2) ^^^^ 
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Then ay, as given in ([2]) can be unbiasedly estimated using 
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Suppose N > Us with probability one. Then the average over a large number 
of values of Y, i.e. a large number of values of N, {Xi} takes the form 

n=l 



where A'max is the largest observed value of N and VX„ denotes the average 
of the observed values of VX„ for which the corresponding N > n. This takes 
the form of ([T|) with term F„ = J'''^'?^"- — I obtained from simulating values of 
A'^ alone. Suppose we wish to minimize the variance subject to a constraint on 
the expected value of N, i.e. 
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Of course we also require that Qn is non-increasing and positive but for the 
moment we will ignore these additional constraints. Then we obtain, on differ- 
entiating the Lagrangian with respect to Q„ ,with ^„ — ^" 2"^^ j 



Qn - Cy/\2{^^^^C^^)WJI^^V^\ (4) 

where c is determined by the constraint ^^ Qn — ^J■ and the minimum variance 
is finite provided that 



Y, V|2 (^°o - 6") Va^, - VafI < 00. 
i=i 

While this is not entirely practical because it requires Xoo , it is common to have 
some information on the rate of convergence of the sequence that can be used 
to design an asymptotically appropriate sequence Qn ■ For example if we believe 
Xoo — Xn ^ ar"" for some r < 1 and a, then we might choose Qn ~ cr^^"^ or 
a random variable TV which has a geometric distribution, at least in the tails. 
Suppose the sequence Xn is deterministic and we use Qn ~ c^|Vx„|. Then 
the variance is finite provided the series ^^ y^VaJ^is convergent. 

Let us consider a simple example before we look at more complex ones. 
Suppose Xn = b + ar"- for n — l,...,|r| < 1 and Xoo ~ b. Then VX„ = 
ar"-i(r - l),n > 2 and VX^ ^ 2a6r"-i(r - 1) + a'^r'^^-'^(r'^ - l) and, as 



we already verified more generally, E{X) — E Xq + J2'^=i'^^n — q 



J(rt<JV) 



= b 



whatever the distribution of TV. Suppose we use a shifted geometric distribution 



for N so that P{N > n) = q"-'',n = s,s + 1, 
minimize the variance we should choose 



so that q 



Qn ~c|r|",n= 1, 
\r\. The variance for general q is 



for \q\ < 1. Evidently to 
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where 1 > q> r . 



Suppose we wish to minimize this over the values of s and q subject to the 
constraint that E{N) — t-^+s = /ijv is constant (ignoring the integer constraint 
on s). Then with z 
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minr ^^^^^ ^^ 

z 

which minimum occurs when z — 
minimum variance is 
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for — > z > 1 
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and then the 
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Notice that the mean squared error, if we were to stop after fi^ iterations, 
so we have purchased unbiasedness of the estimator at a cost of 
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increasing the MSE by a factor of approximately |r|~^-i''i . This factor is plotted 
in Figure [T] It can be interpreted as follows: in the worst case scenario when 
p is around .4, we will need about 3 times the sample size for the debiased 
estimator to achieve the same mean squared error as a conventional iteration 
using determinisitic N. However when \r\ is close to 1 indicating a slower rate 
of convergence, there is very little relative increase in the MSE. 

Note: the optimisation problem above tacitly assumed that the computation 
time required to generate the sequence is 0{n). This is not the case with some 
applications; for example in the numerical integral below the computation time 
is 0(2") since there are 2" intervals and 2" + 1 function evaluations and in 
this case a more appropriate minimization problem, having budget constraint 
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Figure 1: Relative increase in MSE due to debiasing the sequence. 
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which minimum appears to occur for s = [log2(c)] and q — 
r < 0.53 and otherwise s may be somewhat smaher. Intuitively, when the rate 
of convergence is reasonably fast ( so r is small) then the minimum variance is 
achieved by a large guarantee on the value of N (s large) and then the residual 
budget (c — 2*) used to produce unbiasedness by appropriate choice of q. 



3 Examples 

Example 1 Unbiased Estimation of a root 

Suppose we wish to find the root of a nonlinear function h. For a toy example, 
suppose we wish to solve for x the equation 

h{x) — a = 0. 

We might wish to use (modified) Newton's method with a random starting value 
to solve this problem, requiring randomly generating the initial value Xq and 
then iterating 

Xn+i ^ Xn- Sn where J„ = min(l, max(-l, " , )) 

but of course after a finite number of steps, the current estimate Xn is likely a 
biased estimator of the true value of the root. We implemented the debiasing 
procedure above with h{x) ~ x^ and a = l. We generated Xq from a f7(— 2,3) 
distribution, chose P{N = n) = (1 — p)p"^'', n = s, s + 1, ...and for simplicity 
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used s = 4,p = I and repeated for 100,000 simulations. The sample mean 
of the estimates was 1.0023 and the sample variance 0.011. Although the 
procedure works well in this case when we start sufficiently close to the root, it 
should be noted that this example argues for an adaptive choice of Qn, one which 
permits a larger number of iterations (larger values of N) when the sequence 
seems to indicate that we have not yet converged. This is discussed below. 

Stopping times for N. 

In view of the last example, particularly if Xo is far from x^o, it would appear 
desirable to allow A'^ to be a stopping time. In order to retain unbiasedness, it 
is sufficient that 
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Therefore it is sufficient that Q„ = P(N > n\Xi,X2, ■■■Xn) and one simple rule 
for an adaptive construction of A is: 



P(7V>n|Ai,A2,...A„) = 



P{N>n-l\Xi,X2,...X^-i) if VX„>e 
pP(7V>n-l|Ai,A2,...A„_i) if VA„<e 



There are, of course, many potential more powerful rules for determining the 
shift in the distribution of A but we we concentrate here on establishing the 
properties of the simplest version of this procedure. 

Example 2 Simpson's rule. 

Consider using a trapezoidal rule for estimating the integral 



f{x)dx 



using 2"+l function evaluations which evaluate the function on the grid 0, Ax, 2x, ...2"Aa; 

1. Denote the estimate of the integral /„. Here Ax — 2^" and the error 

in Simpson's rule assuming that the function has bounded fourth derivative 

is 0{{Ax)'^) = 0((2-")'*) = 0(16""). This suggests a random A^ such that 

Qn ^ 7?r or a (possibly shifted) geometric distribution with p — ^. Suppose 

Qn — 4""+^,n = 2,3,... This means E{N) = | which is quite smaU. In 

general, the estimator has finite variance since 
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V/„ 



< 00. 



More generally, if N has a shifted geometric distribution with probability func- 
tion P{N — n) — p{l — p)"~*, n = s,s + l, ... parameter p, the expected number 
of function evaluations in the quadrature rule is 

oo oo 

^(2" + 1)P{N = n) = ^(2" + 1)(1 -pr-'p 
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and this is 7, for example, when p = |, s = 2. How well does this perform? This 
provides an unbiased estimator of the integral with variance 

^^ v/„(i-o. 
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which can be evaluated or estimated in particular examples and compared with 
the variance of the corresponding crude Monte Carlo estimator. For a reasonable 
comparison, the latter should have the same (expected) number of function 
evaluations, i.e. 7 and therefore has variance 



f{x)dx-ll 



Take, for example, the function f{x) =■ sin(7ra;) so that /oo ~ /q sin(7ra;)(ia; = 
;| — 0.63662 and J„ (sin(7rx))^(ia; = i. In this case the variance of the MC 
estimator with seven function evaluations is ^ (0.5— (^) ) ~ 0.013531. We 
compare this with the estimator obtained by randomizing the number of points 
in a Simpson's rule. Here it is easy to check that 
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0.3047 


0.2149 


0.2124 


0.2122 


0.2122 


0.2122 


v/„ 


-0.089821 


-0.002562 


-0.000139 


-0.000008 


-0.000001 


-0.00000003 



Table 1: Values of the numerical integral /„ and V/„ with 2" intervals 



and in this case the variance of the debiased Simpson's rule estimate is a\ ~ 
6.41 X 10^® indicating more than a two thousand-fold gain in efficiency over 
crude Monte Carlo. 

Note: We have chosen the grid size 2^" in view of the fact that when N — n, 
we need the integrals Ij for all j < n. In this case, we can simply augment the 
function evaluations we used for Ij in order to obtain Ij+i. 

The major advantage of this debiasing procedure however is not as a replace- 
ment for Crude Monte Carlo in cases where unbiased estimators exist, but as 
a device for creating unbiased estimators when their construction is not at all 
obvious. This is the case whenever the function of interest is a nonlinear func- 
tion of variables that can be easily and unbiasedly estimated as in the following 
example. 



Example 3 Heston Stochastic Volatility model 

In the Hcstoii stochastic volatihty model, under the risk neutral measure Q, 
the price of an asset St and the volatility process Vt are governed by the pair of 
stochastic differential equations 



dSt = rStdt + y%StpdWi (t) + v/1 - P^dW2 (t) , 5*0-50 
dVt:^ Kie-Vt)dt + avVVtdWi{t), Vo = vq 

where Wi{t) and W2(t) are independent Brownian motion processes, r is the 
interest rate, p is the correlation between the Brownian motions driving the 
asset price and the volatility process, 9 is the long-run level of volatility and k 
is a parameter governing the Denote by BS{So,K,r,T,a-) the Black-Scholes 
price of a call option having initial stock value 5*0, volatility ct, interest rate r, 
expiration time T, option stike price K and dividend yield. The price of 
a call option in the Heston model can be written as an expected value under 
the risk-neutral measure Q of a function of two variables g{VT,I(T)) (see for 
example Willard (1997) and Broadie and Kaya (2006)) 



e-''^E'^{ST-K)+ - E[BS{So^, K, r, T, ay^l - p^)] where 
e = C(^T,/(T)) = exp(-^/(T) + pj ^dWM) 

2 

= cxp(-^/(r) + -{yT - Vo + k/(T) - kBT)) and 
2 a 



G - a(/(T)) = y ffl where /(T) = / V,ds. 

This can be valued conditionally on Vt-,1{T) with the usual Black-Scholes for- 
mula. In particular with ^(Vt, 1^)) — BS{So^, K, r, T, ay/l — p^), the option 
price is £;Q5(Vt,/(T)). 

Note that g is clearly a highly nonlinear function of Vt and I{T) and so, 
even if exact simulations of the latter were available, it is not clear how to 
obtain an unbiased simulation of g. In the Heston model, and indeed various 
other stochastic volatility models, it is possible to obtain an exact simulation 
of the value of the process Vt at finitely many values of t so it is possible to 
approximate the integral I{T) using In{T) obtained from a trapezoidal rule 
with 1-1-2" points. This raises the question of what we should choose as a 
distribution for N. Under conditions on the continuity of the functional of the 
process whose expected value is sought, Kloeden and Platen (1995, Theorem 
14.1.5, page 460) show that the Euler approximation to the process with interval 
size 2~" results in an error in the expected value of order 2~"-^ where x = 1 
for sufficiently smooth (four times continuously differentiable) drift and diffusion 
coefficients so for simplicity consider this case. This implies that 

\Eg{VTjn{T)) - Eg{VTj{T))\ < constant x 2"". 



which suggests we choose Q„ ^ 2~". 

As before we randomly generate N from a (possibly shifted) geometric (p) 
distribution with p — ^. The function to be integrated V^ is not twice differ- 
entiable so we need to determine empirically the amount of the shift (and we 
experimented with reasonable values oi p). We chose parameters p = 0.5 and 
shifted the geometric random variable by s = 2 so that P{N — n) ~ p{l —pY''^^ 
for n — s, s + 1,.... The parameters used in our simulation were taken from 
Broadie and Kaya(2004): so = 100; K = 100; Vb = 0.09; k = 2.00; 9 = 0.09; r = 
.05; a = 1; /9 = — .3; T = 5; for which, according to Broadie and Kaya, the true 
option price is around 34.9998. 1,000,000 simulations with p = 0.75 and s = 4 
provided an estimate of this option price of 34.9846 with a standard error of 
0.0107 so there is no evidence of bias in these simulations. With parameter val- 
ues e = 0.019; K = 6.21; a = 0.61; vq = 0.010201; r = 0.0319; so = 100; K = 100; 
T = 1; p = —0.7 and p = 0.75 and s = 4, we conducted 10^ simulations leading 
to an estimate of 6.8115 with a standard error of 0.0048998. This is in agree- 
ment with the Broadie and Kaya "true option price" of 6.801. Note that the 
Feller condition for positivity requires 2k9 > a"^ which fails in the above cases. 
This means that the volatility process hits zero with probability one, and for 
some parameter values, it does so frequently which may call into question the 
value of this model with these parameter values. 100,000 simulations from these 
models used about 10-13 minutes running Matlab 5.0 on an Intel Core 2 Quad 
CPU @2.5 GHz. 

4 Conclusion 

When numerical methods such as quadrature or numerical solutions to equations 
may result in a biased estimator, a procedure is suggested which eliminates this 
bias and provides statistical estimates of error. This procedure is successfully 
implemented both in simple root finding problems and in more complicated 
problems in finance and has enormous potential for providing Monte Carlo 
extensions of numerical procedures which allow unbiased estimates and error 
estimates. 
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